# Audio Classification

Felguk Suno Or People
Apache-2.0
This model is used to classify audio clips as either 'Suno' music or 'People' music.
Audio Classification Transformers Supports Multiple Languages
Felguk
58
1
Whisper Tiny Tel Tam Try1
Apache-2.0
An audio classification model fine-tuned from openai/whisper-tiny, reported to perform well on voice command datasets
Audio Classification Transformers
JasHugF
18
0
Ph Audio Classification V1
Apache-2.0
A fine-tuned audio classification model based on DistilHuBERT, achieving 100% accuracy on the evaluation set
Audio Classification Transformers
herbiel
272
0
Music Classifier
Audio classification model based on Wav2Vec2 for music genre recognition
Audio Classification Safetensors
gastonduault
478
2
Ast Finetuned Audioset 10 10 0.4593 Finetuned Gtzan
BSD-3-Clause
An audio classification model based on the AST architecture and fine-tuned on the GTZAN music classification dataset, achieving 89% accuracy
Audio Classification Transformers
eonrad
1
0
My Awesome Mind Model
Apache-2.0
An audio classification model fine-tuned from facebook/wav2vec2-base on the MInDS-14 dataset
Audio Classification Transformers
faaany
1
0
Ast Finetuned Audioset 10 10 0.4593 Finetuned Gtzan
BSD-3-Clause
This model is a fine-tuned version of Audio Spectrogram Transformer (AST) on the GTZAN music classification dataset for audio classification tasks, achieving an accuracy of 88%.
Audio Classification Transformers
abnerh
2
0
Speech Emotion Recognition With Facebook Wav2vec2 Large Xlsr 53
Apache-2.0
A speech emotion recognition system fine-tuned from the Wav2Vec2 Large XLSR-53 model, capable of identifying 7 common emotions
Audio Classification Transformers
firdhokk
66
0
AST ASVspoof5 Synthetic Voice Detection
BSD-3-Clause
A synthetic speech detection model fine-tuned from MIT/ast-finetuned-audioset-10-10-0.4593, used to identify whether an audio clip is synthetic speech.
Audio Classification Transformers
MattyB95
281
0
Genrevim Music Detection DistilHuBERT
This model is a fine-tuned audio classification model based on DistilHuBERT, specifically designed to distinguish between music and non-music audio.
Audio Classification Transformers
MarekCech
61
0
Testv4
A 5-class audio classification model fine-tuned from a pre-trained wav2vec model on the SUPERB dataset
Audio Classification Transformers
anderloh
27
0
Wav2vec Base Crema Sentiment Analysis
Apache-2.0
A speech emotion analysis model fine-tuned from facebook/wav2vec2-base, achieving 70.87% accuracy on the evaluation set
Audio Classification Transformers
Piyush2512
38
0
Wav2vec2 Base Finetuned Ks
Apache-2.0
An audio classification model fine-tuned from wav2vec2-base on an audio folder dataset, achieving 99.82% accuracy on the validation set
Audio Classification Transformers
motheecreator
54
3
Violence Detect 44
Apache-2.0
An audio classification model fine-tuned from facebook/wav2vec2-base-960h for detecting violent sounds
Audio Classification Transformers
Hemg
28
0
My Awesome Mind Model
Apache-2.0
An audio classification model fine-tuned from facebook/wav2vec2-base, achieving 58.92% accuracy on the evaluation set
Audio Classification Transformers
Krithika-p
15
0
Vit Base Patch16 1024 128.audiomae As2m Ft As20k
A Vision Transformer (ViT)-based audio processing model, pre-trained on AudioSet-2M with the self-supervised masked autoencoder (MAE) method and fine-tuned on AudioSet-20k
Audio Classification
gaunernst
335
2
Wav2vec2 Base Music Speech Both Classification Finetuned Gtzan
Apache-2.0
Audio classification model based on wav2vec2 architecture, fine-tuned on the GTZAN dataset for music and speech classification tasks
Audio Classification Transformers
0bi0n3
15
1
Cat Dog Sounds Classification
Apache-2.0
An audio classification model for cat and dog sounds, built on a wav2vec 2.0 speech model pre-trained on 960 hours of English speech data
Audio Classification Transformers
dima806
25
4
Musical Instrument Detection
Apache-2.0
An audio classification model for detecting musical instruments, built on a wav2vec 2.0 speech model pre-trained on 960 hours of English speech data
Audio Classification Transformers
dima806
2,109
7
Classical Composer Classification New
An audio classification model based on facebook/wav2vec2-base-960h, capable of identifying the composer of classical music audio clips
Audio Classification Transformers
dima806
15
2
Distilhubert Finetuned Gtzan
Apache-2.0
This model is an audio classification model based on the DistilHuBERT architecture, fine-tuned on the GTZAN music genre classification dataset, achieving an accuracy of 89%.
Audio Classification Transformers
sandychoii
15
0
Ast Finetuned Audioset 10 10 0.4593 Finetuned Gtzan
BSD-3-Clause
This is an audio classification model based on the AST (Audio Spectrogram Transformer) architecture, fine-tuned on the GTZAN music genre classification dataset.
Audio Classification Transformers
nomad-ai
15
0
Mert Base
MERT is an acoustic music understanding model based on self-supervised learning, using pseudo-labels provided by a teacher model for pre-training.
Audio Classification Transformers
yangwang825
26
0
Distilhubert Finetuned Gtzan
Apache-2.0
An audio classification model fine-tuned from DistilHuBERT on the GTZAN music classification dataset, achieving 89% accuracy
Audio Classification Transformers
VinayHajare
20
1
Wav2vec2 Base Finetuned Gtzan
Apache-2.0
An audio classification model fine-tuned from facebook/wav2vec2-base on the GTZAN dataset, primarily used for music genre classification tasks.
Audio Classification Transformers
wilson-wei
14
0
Wav2vec2 Base Music Speech Both Classification
Apache-2.0
An audio classification model fine-tuned from facebook/wav2vec2-base for distinguishing between music and speech
Audio Classification Transformers
FerhatDk
20
0
Ast Finetuned Audioset 10 10 0.4593 Finetuned Gtzan
BSD-3-Clause
An audio classification model based on AST architecture, fine-tuned on the GTZAN dataset for music genre classification tasks
Audio Classification Transformers
vineetsharma
14
0
Whisper Tiny Finetuned Gtzan
Apache-2.0
An audio classification model fine-tuned from openai/whisper-tiny on the GTZAN dataset, achieving 91% accuracy
Audio Classification Transformers
vineetsharma
17
0
Distilhubert Finetuned Gtzan
Apache-2.0
An audio classification model fine-tuned from DistilHuBERT on the GTZAN music classification dataset, primarily used for music genre classification tasks.
Audio Classification Transformers
susnato
14
0
Ast Finetuned Audioset 10 10 0.4593
Audio Spectrogram Transformer (AST) model fine-tuned on the AudioSet dataset for audio classification tasks
Audio Classification Transformers
Xenova
82
0
Wav2musicgenre
Apache-2.0
An audio classification model fine-tuned from facebook/wav2vec2-base for music genre recognition
Audio Classification Transformers
ramonpzg
20
0
Voip Classification
Apache-2.0
A speech classification model fine-tuned from facebook/wav2vec2-base on an audio folder dataset
Audio Classification Transformers
james-xie-rng
18
0
Neunit Ks Kangyuan0601
Apache-2.0
An audio classification model fine-tuned from facebook/wav2vec2-base on the SUPERB dataset, achieving 99.87% accuracy on the evaluation set.
Audio Classification Transformers
SHENMU007
16
0
Neunit Ks 529
Apache-2.0
An audio classification model fine-tuned from facebook/wav2vec2-base on the SUPERB dataset, achieving 99.98% accuracy
Audio Classification Transformers
SHENMU007
14
0
CREMA D Model
Apache-2.0
A speech emotion recognition model fine-tuned from facebook/wav2vec2-base, achieving 73.22% accuracy on the evaluation set
Audio Classification Transformers
jdmartinev
21
0
Birds Model
Bird sound recognition model fine-tuned from Microsoft's WavLM-Large model
Audio Classification Transformers
saadashraf
26
0
Bird Classification Model
Apache-2.0
An audio classification model fine-tuned from facebook/wav2vec2-base for identifying bird sounds
Audio Classification Transformers
Saads
19
1
Astie Finetuned On Shemo
BSD-3-Clause
This model is a version of the AST model fine-tuned on the ShEMO dataset, primarily used for speech emotion recognition tasks.
Audio Classification Transformers
minoosh
24
0
Ast Finetuned Audioset 10 10 0.4593 Finetuned Ie
BSD-3-Clause
An audio classification model fine-tuned from MIT/ast-finetuned-audioset-10-10-0.4593, achieving 60.76% accuracy on the evaluation set.
Audio Classification Transformers
minoosh
14
0
Audio Class Finetuned
Apache-2.0
An audio classification model fine-tuned from facebook/wav2vec2-base on the SUPERB dataset, achieving 65.78% accuracy on the evaluation set.
Audio Classification Transformers
Chemsseddine
20
0
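Most entries in this listing carry the Transformers tag, so they can typically be loaded through the Hugging Face `transformers` audio-classification pipeline. The following is a minimal sketch, not taken from any model card. It assumes `transformers`, `torch`, and `numpy` are installed, defaults to `MIT/ast-finetuned-audioset-10-10-0.4593` (the base checkpoint named in several descriptions), and assumes that checkpoint expects 16 kHz mono input. The helper names `sine_wave` and `classify` are hypothetical.

```python
import math


def sine_wave(freq_hz=440.0, seconds=1.0, sampling_rate=16000):
    """Generate a mono test tone as a list of floats in [-1, 1]."""
    n_samples = int(seconds * sampling_rate)
    return [
        math.sin(2 * math.pi * freq_hz * i / sampling_rate)
        for i in range(n_samples)
    ]


def classify(
    waveform,
    sampling_rate=16000,
    model_id="MIT/ast-finetuned-audioset-10-10-0.4593",  # assumed default
    top_k=3,
):
    """Return the top_k predictions as a list of {"label", "score"} dicts."""
    # Imported lazily so sine_wave() works without these packages installed.
    import numpy as np
    from transformers import pipeline

    classifier = pipeline("audio-classification", model=model_id)
    audio = {
        "raw": np.asarray(waveform, dtype=np.float32),
        "sampling_rate": sampling_rate,
    }
    return classifier(audio, top_k=top_k)


# Example call (downloads the checkpoint on first use):
#   for pred in classify(sine_wave()):
#       print(pred["label"], round(pred["score"], 3))
```

Swapping `model_id` for another repository from the listing should work for models that ship a compatible feature extractor, though the expected sampling rate and label set differ per model and are worth checking on each model card.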